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ABSTRACT 

We study the spatial distribution of faint satellites of intermediate redshift (0.1 < z < 0.8), early- 
type galaxies, selected from the GOODS fields. We combine high resolution HST images and state- 
of-the-art host subtraction techniques to detect satellites of unprecedented faintness and proximity 
to intermediate redshift host galaxies (up to 5.5 magnitudes fainter and as close as 0'.'5/2.5 kpc to 
the host centers). We model the spatial distribution of objects near the hosts as a combination 
of an isotropic, homogeneous background/foreground population and a satellite population with a 
power law radial profile and an elliptical angular distribution. Wc detect a significant population 
of satellites (iVg = 1.7^Q g) that is comparable to the number of Milky Way satellites with similar 
host-satellite contrast. The average projected radial profile of the satellite distribution is isothermal 
(7p = —1.0^04), which is consistent with the observed central mass density profile of massive early- 
type galaxies. Furthermore, the satellite distribution is highly anisotropic (isotropy is ruled out at a 
>99.99% confidence level). Defining (j) to be the offset between the major axis of the satellite spatial 
distribution and the major axis of the host light profile, we find a maximum posterior probability of 
(j) = and |(^| less than 42° at the 68% confidence level. The alignment of the satellite distribution 
with the light of the host is consistent with simulations, assuming that light traces mass for the host 
galaxy as observed for lens galaxies. The anisotropy of the satellite population enhances its ability to 
produce the fiux ratio anomalies observed in gravitationally lensed quasars. 

Subject headings: dark matter. Galaxies: dwarf. Galaxies: formation. Galaxies: halos. Gravitational 
lensing: strong. Gravitational lensing: weak 



1. INTRODUCTION 

[ Cold Dark Matter (CDM) simulations have been suc- 
cessful at predicting the distribution of matter on super- 
galaxy scales (greater than a few Mpc), but less suc- 

[ cessful on smaller scales, predicting orders of magnitude 
more satellite galaxies th an are observed in the local 
group (jStrigari et al.ll200"7l ). The solution to the so-called 

[ missing satellite problem will have important implica- 
tions for dark matter cosmo logy and st ar forma.tion i n 
low mass halos (|Klvpin et a l. 1999; Moo re et al.lll999f ). 
The discrepancy between observed and predicted satel- 
lite numbers could be due to incorrect assumptions about 
the nature of dark matter in the cosmological simula- 
tions. For instance, it has been proposed that small 
scale structure does not form because dark matter is 
not actually "cold" bu t instead has appreciable velocity 
(e.g. IColm et all I2000f ) . or that the power spectrum of 
density fluctuations used in si i nulations is incorrect (e.g . 
IKamionkowski fc Liddi^[2000t IZentner fc Bullockll2003l ). 
Alternatively, it is possible that the simulations are cor- 
rect but that very low mass satellites are too faint to be 
detected. The luminosity of satellites is difficult to pre- 
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diet theoretically as high resolution CDM simulations do 
not include baryonic interactions and therefore neglect 
how processes such as UV reionization, baryonic cool- 
ing, supernovae, and g as accretion affect star formation 
in da rk matter halos (Thoul fc Weinb erg"1996t iGnedinI 
l2000t [Kaufm ann et al. I2008| ). Hydrodynamical simula- 
tions which include gas and radiative transfer have been 
able to reproduce some aspects of the observed luminos- 
ity a nd metallicity properties of galaxies (see iSpringell 
I2OIOI and references therein). However, these simula- 
tions must be improved in order to attain the complete 
and self-consistent understanding of star formation which 
is needed to predict the luminosity function of satellite 
galaxies. 

While the luminosity and mass functions of satellites 
can be used to constrain cosmology and star forma- 
tion, their spatial distribution can be used as a tracer 
of the total mass distr ibution of galaxy halos. Numer- 
ical simulations (iKravtsovll2010[ ) predict that satellites 
should be distributed as the mass profile of the host 
galaxy which, neglecting baryons, is expected to be ap- 
proximately a Navarro et al.l ()1996l hereafter NEW) pro- 
file. However, the inner region of the host halo pro- 
file is likely to be steeper in the presence of b aryons 
fe.g.. lBlumenthal et al.lll986l : iGnedin et al.ll2004[ ). From 
gravitational lensing techniques, the total central mass 
density distributions of massive early-type galaxies have 
been shown to follow approximately isothermal profiles 
(i.e., pjr) (X r~^; [Cp hn et al. 2001; Trcu & Koopmani 
[2001 IKoopmans et al.l l2006l 120091: lAuger et al.l l20lC 
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The two point correlation function of luminous red galax- 
ies (limiting magnitude ^ L*) in the Sloan Digital Sky 
Survey (SDSS) has been shown to be consistent with 
an isothermal satellite distribution ("Wat son et alJ[2010l 
hereafter WIO), ahhough ^Chcn (2008) (hereafter COS) 
studied fainter satellites with a limiting magnitude of 
about O.IL* and found that these satellites followed 
a distribution closer to NFW. This apparent discrep- 
ancy in the form of the radial profile of satellites is 
likely due to the differences in how line-of-sight inter- 
lopers are treated in the two studies and the differ- 
ent radial ranges that were probed. The angular pro- 
file of satellites is also expected to follow the host mass 
distribution, and simulations predict that satellites will 
have an anisotropic spatial distribution, appearing pref- 
erentiall y along the m ajor axis of the host mas s pro- 
file (e g. IZentner et al.ir2005- .Zentner 2006: Knebe et alJ 
[2?)0llAubert et alJl2004l : lFaltenbacher'eral1l2007ll2008ir 
This anisotropy has b een observed in the local group 
(e.g. iMetz et aLl I2009D and in SPSS galaxy distribu- 
tions (Braincrd"2005VYang eliDIlOOi iWang et al.ll2Q08l: 
[Agustsson & Braincrd 2010). 

Furthermore, the abundance of bright satellites (typ- 
ically brighter than about 0.1 L*) near hosts, also 
known as close pairs, has been used to constrain the 
rate of major mergers given assumptions about merger 
timescales and st ellar mass to vir ial mass ratios (e.g. 
Beh et al.l [20061: lPa.tton k Atfieldl [2 008: B undv et al.l 



20091: iRobaina et al.il2010l : iLe Fevre et al. 2000). Merg- 



ers are believed to play a key role in gal a xy gr owth. 
For instance, simulations by iHopkins" etall (|2010l ) indi- 
cate that about 30 percent of the mass in galaxy bulges 
comes from minor mergers (mergers where the mass ra- 
tio between merging objects is less than 1/3). Further- 
more, observations and simulations have shown that mi- 
nor mergers may play a dominant role in determining 
the rece nt (z< 1) star formation activity in early-type 
galaxies (jKavirai et al.ll20d9l 120111 ). One would therefore 
like to study the m erger rates at lower virial and stellar 
mass scales (see also lBundv eTaL|[2007l : iNaab et al.|[2009l: 



iFakhouri et al.ll2010[ ). although this requires finding and 
studying faint satellites at cosmological distances. 

iJackson et al.l ()2010l hereafter JIO) have used high reso- 
lution Hubble Sp ace Telescope (HST) imaging from the 
COSMOS survey (jScoville et al.ll2007l ) in order to search 
for satellites around approximately 11,000 massive galax- 
ies. They searched for companion objects within 2" from 
their hosts, although the closest objects remained unde- 
tected as can be seen, for example, in the third panel of 
Figure 3 of their paper. They compared their findings 
to the number of satellites that have been found around 
22 CLASS lens systems and found that the lens systems 
were more likely to have a companion object than a typ- 
ical COSMOS elliptical galaxy. 



The efforts of IJackson et al.l (|2010f) highlight the diffi- 
culties of studying satellites at higher redshifts, where 
decreasing physical resolution reduces the number of 
faint companions that can be observed near bright hosts. 
However, one successful method of studying the mass and 
positional properties of high redshift satellites has been 
observations and simulations of gravitationally lensed ac- 
tive galactic nuclei (AGN) fe.g. lDalal fc Kochanekll2002l : 



IMcKean et alll200l IVegetti et al]l2009t IXu et al.ll2009l) . 

This method shows great promise with regards to the 
missing satellite problem because the brightness of the 
lensed images is very sensitive to local perturbations of 
the deflector potential, such as those expected from sub- 
structure. Thus, mass models which reproduce lensed 
image magnitudes can be used to infer the presence of 
satellites regardless of whether or not the satellites are 
luminous. However, there are two main difficulties with 
using lensed quasars to study satellites. The first is the 
small number of suitable strong lens systems currently 
know n and the complexity of their selection function 
(e.g.. iTreul 120101 and references therein). This limita- 
tion will become less important with new surveys such 
as the Large Synoptic Survey Telescope (LSST) which 
will vastly i ncrease the number of detected quasar lenses 
(jOguri fc Mar shall 2010). A second and more significant 
difficulty with using Icnsing to study satellite properties 
is that bright lensed images make it difficult or impos- 
sible to measure the lumi nosity of perturbing satellites 
(e.g. iMcKean et al.l |2005[) . although this will be miti- 
gated by future telescopes with higher resolution than 
what is currently available. In addition to AGN, re- 
cent work has shown that dark satellites can be detected 
by reconstructing the le nsing potential using extended 
background sources (e.g.. lKoopmansl[2005HVegetti et al.l 
[^09, 2010), further enlarging the sample of gravitational 
lenses that can be used to detect substructure. Fur- 
thermore, substructure might be detectable using de- 
tailed positional and time-delay measure ments of lensed 
images rather than just magnification ([MacLeod et al.l 
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thus further addressing the first limitation. 



A powerful approach to understanding star formation ef- 
ficiency in low mass satellite galaxies is the combina- 
tion of lensing studies to constrain the mass function of 
satellites, and imaging studies to c onstra in the luminos- 
ity function (jTreu 2010; Kravtsovl 120101 and references 
therein). With this goal in mind, we have started a new 
program to characterize the visible properties of faint 
satellites of massive galaxies. In this first paper, we fo- 
cus on the spatial distribution of faint satellites of early- 
type galaxies at intermediate redshifts, 0.1 < z < 0.8 , 
selected from the GOODS fields (jGiavalisco et al.l l2004). 
We concentrate on this population of hosts because early- 
type gal axies dominate th e sample of strong lensing 
galaxies ([Auger et al.ll2009l ) and are therefore the proper 
host sample for comparison to the satellite mass func- 
tion results from lensing studies. An additional benefit 
to studying early-type galaxies is that they have rela- 
tively smooth surface brightness profiles which are ideal 
when searching for nearby compact and faint compan- 
ions. 

For this initial analysis we only require one photomet- 
ric band. We use zsso because it is the reddest of the 
GOODS bands and therefore it is the most faithful tracer 
of stellar mass. In forthcoming papers of this series we 
will use multiple band photometry to improve line-of- 
sight interloper removal and constrain the luminosity 
and stellar mass functions of the satellites (Nierenberg 
et al. 2011a, in preparation; hereafter paper II), and we 
will combine our results with the total mass function ob- 
tained from strong lensing studies to constrain the total- 
to-stellar mass ratio of the satellites (Nierenberg et al. 
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2011b, in preparation; hereafter paper III). 

This paper is organized as follows: In ij2]we discuss the 
properties of our host galaxy sample. In ^ we summa- 
rize our image analysis, including our elliptical B-spline 
host galaxy subtraction and faint object detection and 
photometry methodologies. In fJH we take a first look 
at the satellite distribution by means of a binned anal- 
ysis, which is useful for visualizing the main trends and 
identifying the strength of the signal. In fj5]we describe 
our model for the combined satellite plus background 
object spatial distribution and the parameters that we 
aim to constrain (average number of satellites, power law 
slope, etc.) along with a number of nuisance parameters 
(density of the background population, slope of the back- 
ground number counts, etc.). In §[B]we present the results 
obtained by comparing our model to the data. In §[7] we 
discuss our results and compare them to previous satel- 
lite studies. In §[5] we provide a concise summary. The 
appendix contains more detailed explanations of many of 
the methods we used in this paper. 

Throughout this paper, we assume a flat ACDM cos- 
mology with h — 0.7 and 1 7ni = 0-3. All magnitudes 
are given in the AB system (|Okelll97^ unless otherwise 
stated. 

2. HOST GALAXY SAMPLE 

We select a population of early-type (E and S O) host 
galaxies from the catalog of iBundv et al.l (|2005l ). which 
contains spectroscopic redshifts, stellar mass estimates, 
and morphological cl assifications for 47% of its objects 
(see lTreu et al.l [20051 ) . The objects in this catalog were 
originally selected from Hubble Space Telescope photo- 
metric catalogs made by the GOODS team u sing the 
SExTRACTOR software (jBertin k Arnoutslll996[ ). In the 
GOODS-Sout h field we use COMBO-17 photometric 
redshifts ( Wolf et al.l 120041 ) for our hosts where spectro- 
scopic redshifts are not available. In the GOODS-North 
field only a handful of objects have morphological classi- 
fications and stellar mass estimates but no spectroscopic 
redshifts. We excluded these from our analysis. 

We limit the host redshifts to 2 < 0.8 to guarantee 
that we can detect satellites with host-satellite luminos- 
ity contrasts fainter than the equivalent contrast between 
the Small Magellanic Cloud and the Milky Way at all 
redshifts, thus increasing the likelihood that we observe 
approximately one satellite per host. We also exclude 
2: < 0.1 galaxies, which are few (owing to the small vol- 
ume) and too extended in angular size to analyze in the 
same way as the more distant sample. Finally we exclude 
two hosts which appear to be undergoing major mergers 
as these have physical environments which are signifi- 
cantly distinct from the majority of our sample. To en- 
sure that we study the same population of satellites for all 
hosts, despite their broad distribution in redshift, we only 
consider satellites brighter than a fixed fraction (Am) of 
the host galaxy luminosity (i.e. a fixed difference in mag- 
nitude). We also require that all objects be brighter than 
the detecti on t hreshold (z85o= rrimax = 26.5, as described 
in Section [3T2]) . and therefore we cut the parent sample 

® The GOODS catalogs are available at 
http : //archive . stsci . edu/prepds/goods 



to a maximum value of zgso, depending on the choice of 
Am, such that mhost + Am < mmax- 

As we increase the size of the magnitude range we study, 
the number of hosts that are complete within that mag- 
nitude range drops; there are 202, 127 and 71 hosts com- 
plete to Am — 6.0, 5.5, and 5.0 respectively in the final 
host sample. At the same time, the number of satellites 
per host is expected to increase with Am. The opti- 
mal choice of Am needs to strike a balance between the 
two effects. As we will show, for the present GOODS 
dataset we find that Am=5.5 maximizes the signal to 
noise ratio of the detection and therefore we will adopt 
this choice as our default. To illustrate the robustness of 
our results to small changes in Am, we will also describe 
our findings for Am = 5 and 6. The distributions of 
host redshifts, absolute magnitudes, and stellar masses 
for the three choices of Am are shown in Figure [T] As 
Am becomes larger, the hosts that satisfy the complete- 
ness requirements tend to become brighter and shift to 
lower redshift. As with any fiux limited sample, the lower 
redshift objects of the sample will be dominated by the 
more abundant, intrinsically fainter galaxies, while the 
higher redshift objects will be dominated by intrinsically 
brighter objects. We defer an analysis of evolutionary 
trends to papers II and III where we will also study the 
luminosity and stellar mass properties of the satellites. 
Results in this paper are an average of the properties 
of the satellite population in the 0.1 < z < 0.8 redshift 
range. 

We use the second order moment of the host intensity 
distribution, measured along the major axis by SEx- 
TRACTOR Q as a scale factor which we represent by 
i?h- ^h compensates not only for varying angular size 
with redshift, but is also intrinsic ally related to host 
masses via the size-mass relation (e.g. lTrujillo et al.ll2006l : 
iWilliams et al.ll20Tol ) and thus adjusts for variations in 
host masses at a given redshift. i?h varies across the 
host sample in physical size (0.8-5.5 kpc) with host mass 
variations and in angular size (between O'.'2-l'.'l). The 
median (and modal) values of i?h are 0'.'5 and 3 kpc. We 
use these as fiducial values when converting our results 
to angular and physical scales. 

3. DETECTION AND PHOTOMETRY OF CLOSE 
NEIGHBORS 

Companions of high redshift galaxies are difficult to 
study because they are intrinsically faint and often ob- 
scured by the host galaxy light. In this section we de- 
scribe our method of modeling and subtracting the host 
galaxy light profile to allow us to identify and perform 
accurate photometry on nearby objects. Note that while 
we model host light in small regions around each host, we 
do not limit our analysis to objects within this modeled 
region. We use the GOODS catalogs for photometry and 
astrometry for all objects outside of the modeled cutout 
region. 

3.1. Host Galaxy B-Spline Models 

Producing a good model for the light profile of the host 
galaxies is critical for studying the population of faint 

SEXTRACTOR 's AWIN.IMAGE 
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Figure 1. From top to bottom: The distribution of host redshifts, 
absolute V band magnitudes, and stellar masses in th ree host com- 
plet eness ranges (Am < nimax — nihost) taken from lBundv et aLl 
|200a '). 



objects near the host. We use a multi-step process to 
create an accurate model of the host light profile which 
does not include the light from nearby objects. 

We model the host light profile in a small cutout cen- 
tered on the host galaxy. We choose this cutout to be 
20 X 20 i?h in order to ensure that all significant levels 
of host light are removed. In the first step of the model- 
ing process, we use SExtractor to identify all objects 
in the cutout region. The SExtractor parameters at 



this stage are chosen to err on the side of identifying noise 
peaks as objects in order to ensure that all real objects 
being obscured by the host light are recognized; see Ap- 
pendix [X] for a detailed description of the SExtractor 
parameters and how they were chosen. SExtractor 
outputs a segmentation map which we use to mask out 
all identified objects other than the host galaxy. SEx- 
tractor also returns measurements of the axis ratio 
and position angle of the host galaxy which we use in 
addition to i?h to define an elliptical coordinate system 
for the cutout. 

Wc then model the host light in the masked image us- 
ing empirical polar B-spline models. We choose B-spline 
models because they quickly fit the observed light dis- 
tribution with a smooth model that is independent of 
PSF effects and is more flexible than Sersic models, for 
example. Our method is similar to that described by 
iBolton et a l. (2005, 2006). Each iteration of the B-spline 
code fits a model to the masked data, subtracts the 
model, and then identifies and masks new residual struc- 
tures. This process is repeated three times. The final 
B-spline model is then subtracted from the data to pro- 
duce a residual image, which is used to perform object 
detection and photometry for field galaxies near the host. 
Further details of the modeling and masking procedure 
are provided in Appendix |3 Examples of our host sub- 
traction procedure are shown in Figure [5] 

3.2. Object Detection and Photometry 

We use SExtractor to detect and measure the prop- 
erties of objects within the host-subtracted cutouts. To 
ensure completeness, we limit the sample to objects with 
MAG.AUTO Z85o< 26.5 magnitudes, where the GOODS 
images are virtually 100% complete, even close to the 
host galaxy as we will show below. Furthermore, we 
only study objects fainter than zg5o> 21.0 magnitudes. 
This is because our analysis relies on an accurate charac- 
terization of the background (see Section 15. 2p and very 
bright objects appear in low numbers with large fluctu- 
ations which can bias the number counts slope near a 
particular host. 

Because we use the GOODS catalog data outside of our 
cutouts, it is imperative that our detections and photom- 
etry after host light subtraction are as close as possible to 
GOODS detections and photometry. We confirmed this 
by running SEXTRACTOR with our parameters on a large 
un-modeled section of the GOODS field. We recovered 
virtually the same number of objects and our photometry 
was consistent with GOODS photometry. This compari- 
son is described in further detail in Appendix |B] 

Due to the relative brightness of hosts compared to satel- 
lites, there is a minimum radius at which wc will be able 
to accurately identify faint objects, regardless of how 
careful we are during the host modeling process. 

By simulating point sources at and fainter than our cho- 
sen limiting magnitude (26.5) at varying distances from 
a representative subset of hosts, we find the minimum 
radius for completeness to be 1.5 i?h (^ 0'.'9) for the 
vast majority of cases. The few cases for which highly 
flattened hosts have significant residuals extending to a 
larger distance are identified manually; for those systems, 
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appropriately larger inner regions are excluded from our 
analysis (see Appendix R)) . 

Note that in the redshift range we study, even the most 
intrinsically faint sources w ill have effective rad ii typi- 
cally less than O'.'l (see, e.g.. lde Riicke et al]|2009() which 
is smaller than the FWHM of the GOODS PSF. Thus 
these sources are effectively point sources. We test our 
sensitivity to intrinsically faint sources by simulating 
faint exponential disks near our hosts at varying red- 
shifts, wi th effective radi i estim ated from the relations 
given by Ide Riicke et alj (|2OO90 . In the innermost re- 
gion, between 1.5 and 3.5 i?h, we failed to detect approx- 
imately 20%E| of 26.5 magnitude objects. Outside of this 
region we recovered approximately 100% of the simulated 
objects. For brighter objects with apparent magnitudes 
of 24.5 our recovery was approximately 100% complete 
even in this inner region. In our final analysis we study 
the satellite population between 1.5 and 45 i?h- Taking 
this into account, we can examine the worst case sce- 
nario in which all satellites are exponential disks with 
apparent magnitudes of 26.5. Assuming that satellites 
are distributed radially in projection as Psatir) oc r~^, 
we would only underestimate the final satellite number 
by 1-2 % because even in this extreme case, the incom- 
pleteness is confined to a small region. Thus we expect 
that the loss of completeness in the innermost region for 
the faintest sources will have a negligible effect on the 
analysis of the total sample. 

To compile a single catalog and avoid duplications, we 
compare the position of each object identified in the 
residual images to objects already in the GOODS cat- 
alogs. If the object is already in the catalogs (position 
within O'.'S), we replace the GOODS photometry and as- 
trometry with our own measurements, which do not suf- 
fer from being contaminated by host-galaxy light (see 
Figure [3]). The mean distance between 'matched' ob- 
jects was 4 ± 3 pixels, this corresponds to 0'.'12 which is 
the FWHM of the GOODS PSF. For objects that are 
not matched to objects already in the GOODS catalog, 
we add a new entry. Finally, for all objects outside of 
the host-subtracted cutouts, we use the measurements 
from the GOODS catalog and we do not attempt to de- 
tect new objects. In Figure [2] we show examples of the 
modeling process for a variety of host ellipticities and 
physical sizes, along with newly detected objects in the 
host-subtracted images. 

3.3. Objects Detected in Cutout Regions 

The host light subtraction has two important effects on 
our measurement of objects near the host galaxy. The 
first is that it removes host light contamination and al- 
lows for accurate photometry of nearby objects, as shown 
in Figure|31 This figure compares GOODS photometry to 
host-subtracted photometry for objects that had already 
been identified in the GOODS catalogs. As expected, we 
measure object magnitudes to be slightly fainter than the 
GOODS photometry after host subtraction, with a mean 

* The true incompleteness estimate requires knowledge of the 
object luminosity function. We make a rough estimate by averag- 
ing the results for objects at redshifts of 0.1, 0.4 and 0.8, where 
the redshift 0.1 objects are the most intrinsically faint and most 
difficult to detect. 



difference of 0.12 ± 0.02 magnitudes for objects within 8 
4") of the host galaxy. Our photometry is iden- 
tical to GOODS photometry without host subtraction 
(see Appendix [BJ, and we confirmed that the host light 
removal was accurate by simulating faint point sources 
near the hosts and ensuring that they were recovered 
with accurate photometry. Thus the difference in mag- 
nitude after host subtraction shown in Figure[3]is entirely 
due to the removal of the host light contamination which 
has a significant impact on photometry performed near 
bright objects. 

The second important effect of host light subtraction is 
that it allows us to detect new objects. If the newly de- 
tected objects are real rather than artifacts of the host 
light subtraction, then we expect that object properties 
such as brightness and elongation will be similar to the 
properties of objects that had already been detected in 
the GOODS fields. The effect of weak lensing on mag- 
nification and shear is negligible for the small number 
of background sources close to the host Einstein radii 
in projection (the affected region is typically an annu- 
lus of ^ I'.'O ± 0'.'2 for massive ellipticals). One way to 
see this is using the fact that the effect of weak lensing 
on the background number counts goes as iVobs/A^truc = 
where jj, is the magnification due to the host 
galaxy and /? is the power-law slope of the faint end of 
the background galaxy flux distribution (see Equation 
111 of Part 1 of Schneider, Kochanek and Wambsganss 
2006). In the case of zgso, /? is fairly close to one (about 
0.7) and thus we do not expect weak lensing to have a 
significant impact on our object detection. 

In the top panel of Figure|4l we compare the distributions 
of the axis ratios of the GOODS catalog sources and the 
newly detected objects near the host galaxies. The two 
distributions are indistinguishable, with a Komogorov- 
Smirnoff (KS) probability of being drawn from the same 
distribution of 0.96. The second panel of Figure |3] shows 
the distribution of contrast in MAG_AUTO ((5m= m — 
mh) between host and detected objects. Note the use 
of lower-case (5, which denotes a specific contrast from 
the host and is different from Am which describes the 
allowed maximum contrast between host and neighboring 
objects for a particular data set. The KS value for the 
two distributions being the same is 0.14. This value is 
low, but not low enough to rule out the possibility that 
the distributions are the same. Note that our improved 
host galaxy light subtraction procedure is important for 
detecting companion objects fainter than about (5m= 2.5 
magnitudes (which is about 0.5 magnitudes fainter than 
the magnitude contrast between the Milky Way and the 
Large Magellanic Cloud). 

As expected, the number density radial profiles are very 
different for newly detected objects and objects that had 
already been detected in GOODS (bottom panel of Fig- 
ure |4]), with a KS probability of 6.0 • 10~^^ for new and 
GOODS objects being drawn from the same spatial dis- 
tribution. The number density of new detections in- 
creases steadily towards the center of the host, while the 
number density of objects already in the GOODS cata- 
logs decreases. Host light subtraction triples the number 
density of detected objects in the inner 3". 
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Figure 2. Demonstration of galaxy modeling for a range of host ellipticities and sizes. Original images are shown with residuals after 
host model subtraction immediately below. Host galaxy physical size and ellipticity decrease downwards and to the left. All images are 
20 -Rh on a side. New object detections, made possible by the host light subtraction, are circled in red. Objects that were detected in the 
GOODS catalogs and that have complete photometry within the cutout (i.e. did not raise a SExTRACTOR flag greater than 2) are identified 
by blue squares. Some objects visible to the eye were omitted from our final catalog because they were fainter than our detection limit (for 
instance, the object just outside of the excluded region in the top right galaxy). The central region of the host, excluded in our analysis, 
is identified by a black circle. 



Satellite Spatial Distribution 



7 




Figure 3. Comparison of photometry before (zgso g) ^ii<i after 
(2:850) host subtraction for objects in cutout regions which had 
aheady been identified in the GOODS catalogs prior to host sub- 
traction. 

4. FIRST LOOK 

The host light subtraction discussed in Section [3] allows 
us to create an enhanced catalog of objects near the host 
galaxies. In this section, we show the radial and angu- 
lar profiles of objects in spatial bins in order to provide 
a qualitative sense of the properties of objects near the 
hosts. Binning is useful because it provides a visual rep- 
resentation of data. However, it is inherently limited 
because it requires data points to be averaged, thereby 
losing valuable information. Thus we do not perform our 
analysis on the binned data but instead use this section 
to justify our model choices in Section [5l 

4.1. Radial Distribution 

In Figure [5] we show the average number density of ob- 
jects as a function of distance from the hosts. The num- 
ber density of sources increases dramatically near the 
hosts. At large radii the number density becomes domi- 
nated by the isotropic and homogeneous distribution of 
objects not associated with the hosts. In Section [5] we 
will describe how we analyze the number density signal 
by inferring the combined properties of the satellite and 
background /foreground populations . 

4.2. Angular Distribution 

In Figure [5] we plot the angular distribution of objects 
within 20 i?h of the host galaxies, where t/i = is aligned 
with the host major axes. We show the distribution of \(j>\ 
only for hosts with q< 0.6 in order to ensure all object an- 
gles are well measured (for round hosts it becomes more 
difficult to measure the host position angle). The figure 
shows that objects appear with more frequency towards 
= than would be expected for a uniform distribution 
(shown by a dashed line) . We also compare the observed 
angular distribution of objects to a uniform distribution 
by applying a KS test which rules out the angular distri- 
bution being uniform with 95 % confidence. Recall that 
this is without any kind of attempt to separate the back- 
ground signal from the satellite signal. We investigate 
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Figure 4. Comparison of the properties of objects detected in the 
GOODS catalogs to those of newly detected objects. Top: The 
distribution of object ellipticities (1 — q = 1 — b/a). Middle: The 
distribution of magnitude differences from hosts {Sm= m — m^). 
Bottom: Number density of objects as a function of distance from 
the host. Newly detected objects are closer to the host than those 
in the GOODS catalogs. 

this asymmetry more rigorously in Section [5] 

5. JOINT MODELING OF SATELLITE AND BACKGROUND 
GALAXY POPULATIONS 

We have shown that the neighboring galaxies to the 
GOODS host galaxies exhibit non-trivial radial and an- 
gular distributions, and we now seek to model these dis- 
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Figure 5. Average number density of objects as a function of 
distance from the host in units of the second order moment of the 
host intensity profile along the major axis (i?h)- 
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Figure 6. Number of objects at angle 9 where S = is aligned 
with the host major axis within radius of 20 iJh for hosts with axis 
ratio less than 0.6. The dashed line shows the average bin heights 
for a uniform distribution. 

tributions. We start in § 15.11 by defining the satellite 
model and the parameters we aim to constrain. In § 15.21 
we discuss the background model and the priors we will 
be using. In § 15.31 we summarize our inference method- 
ology, which is described in more detail in Appendix [Dl 
A summary of all model parameters and their priors is 
given in Table [T] Our implementation of the model and 
inference algorithm has been extensively tested by means 
of simulated data. To guide the reader, a schematic of 
a possible realization of our satellite plus background 
model is shown in Figure [T] 

5.1. Satellites 

The scale-free nature of ACDM results motivates us to 
construct our model for the spatial distribution of the 
satellite number density as a function of corresponding 
host parameters. For simplicity, and owing to the rela- 
tively low signal-to-noise ratio of our measurement, we do 
not allow for intrinsic scatter in these scaling functions. 
Thus our inferences connecting host and satellite prop- 
erties are averages for the entire population; individual 
systems will differ from the average. 



We divide our discussion of our model for the satellite 
population into three subsections. We describe the prob- 
ability of finding a satellite at a certain position in Sub- 
section [SIlTTl We discuss choosing a region in which to 
search for satellites in order to normalize the spatial dis- 
tribution in Subsection 15.1.21 Finally the probability of 
finding a certain number of objects around each host is 
discussed in Subsection 15 .1.31 

5.1.1. Satellite Spatial Distribution 

ACDM simulations indicate that the number density of 
satellites follows the mass density profile of their host 
galaxy (Kravtsov 2010). In turn, observations indicate 
that the three dimensional total mass density profile of el- 
liptical galaxies can be approximated by a power law p~'^ 
with ^' 2, (with <10% scatter) fe.g. IKoopmans et al.l 
[20091: lAugereFaniMol) . The po wer law profile seems 
to extend as fa r as 1 00 Re {e.g. IGavazzi et al.l [20071 : 
iLagattuta et al.l I2O1O0 which corresponds to approxi- 
mately 70 i?h- We will thus model the radial distribution 
of satellites as a power law, with projected logarithmic 
slope 7p = 1 — 7'. 

We relate the angular light profile of the hosts to the 
mass profile following the resul t s from the Sloan Lens 
ACS (SLACS) by Bolt on et al.[ ([2008I . hereafter B08). 
BOS used gravitational lensing to study the relationship 
between the mass and light distributions of massive ellip- 
tical galaxies. BOS found that the major axes of the light 
profiles of the galaxies they studied were well aligned 
with the total central mass profiles and that the axis ra- 
tios of the light and mass profiles were the same within 
measurement errors; note that this agreement was es- 
tablished within the Einstein radii of the lensed galaxies, 
which corresponds to roughly ~ l"(see also [Kochaiiea 
120021) . The majority of our hosts are well represented 
by the properties of the SLACS lenses which are mas- 
sive elliptical galaxies with stellar mass es in the range 
- 10i°-5 to lO^^M© (jAuger et al.l [2009l f[. This implies 
that we can expect the angular distribution of satellites 
to roughly follow that of host light and therefore that it 
is reasonable to construct a model of the satellite angular 
profile which is related to parameters which describe the 
host light angular profile. 

We model the satellite distribution in an elliptical coor- 
dinate system described by an angle which is the same 
as the normal polar angle, and a radial coordinate R' 
related to Cartesian coordinates x, y by: 

R' = V^^ + yVi^ (1) 

Where q is the ratio between minor and major axes of the 
elliptical coordinate system. The probability of finding a 
satellite on an elliptical contour within the area element 
R'dR'dO' , goes as a power law with power 7p 

Ps{R')R'dR'dd' oc + / q^y^/^dxdy. (2) 

^ We assume that the small number of hosts (^^ 10) in our sam- 
ple which were less massive do not deviate significantly from the 
relationship between mass and light orientation observed in their 
more massive counterparts. 
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Changing to polar coordinates gives 

Ps{r,0\(t),q,jp) oc 

r^p[cos2 (61 - 0) + sin^ (9 - (fy^'^l'^rdrde 



(3) 



where we introduce (j) to allow for some offset between 
the major axis of the satellite elliptical coordinate sys- 
tem and the major axis of the host galaxy; = 0° corre- 
sponds to parallel alignment, and ((> = 90° corresponds to 
perpendicular alignment (see Figure [T]). This is different 
from the coordinate 0, which describes the offset between 
a particular object and the host major axis. We are only 
interested in the magnitude of the offset between the host 
light profile and the satellite population so we infer |0|, 
the absolute value of the offset of the distribution from 
the host major axis. 

In general the offset in 3 dimensions is related to the 
2D projected offset in a non-trivial way because of the 
random orientation of the host-satellite system in the 
plane perpendicular to the projection plane. However, 
for many interesting and plausible scenarios, the align- 
ment in 3 dimensions relates in an obvious way to the 
observed, projected offset 0. Namely, if the satellites are 
aligned with the host light distribution in 3D then the 
projection of the two systems will also appear aligned. 
Similarly, if the systems are anti-aligned, the satellites 
will appear perpendicular to the host in projection. Fi- 
nally, if the satellites are oriented randomly with respect 
to the host in 3 dimensions, their projection will ap- 
pear isotropic. Thus the projected relationship between 
the satellite spatial distribution and the host light pro- 
file contains relevant information about the 3D distribu- 
tion. 

We also aim to determine the connection between the 
ellipticity of the host light and that of the satellite dis- 
tribution. For this purpose, we define the ellipticity to 
be: 



1 



1 + 



(4) 



We introduce a parameter A to relate the ellipticity of 
the host profile, eh, to that of the satellite distribution. 



l + eh(^-l) 



(5) 



We choose this parametrization because it returns valid 
ellipticities (0 < e < 1) for all values of ^ > 0. This is 
convenient when exploring the A parameter space with 
our MCMC code. The parameter A can be understood 
as follows: when A = 0, the satellite distribution is al- 
ways round regardless of how flat the host distribution is; 
when A = \ the satellite distribution has the same flat- 
tening as the host distribution; and for values oi A> 1, 
the satellite distribution is more flat than the host light 
distribution. Note that formally A goes from zero to in- 
finity. As A approaches infinity, the satellite distribution 
approaches q = Q. We are interested in a qualitative char- 
acterization of the satellite fiattening so we simplify our 
inference on A by restricting our analysis to < j4 < 2, 
which will allow us to distinguish between isotropic and 
flattened satellite distributions without having to explore 
an unnecessarily large parameter space. 

5.1.2. Spatial Normalization 

We choose a maximum and minimum radius in which to 
search for satellites in order to normalize our distribu- 
tion. These radii are determined by observational con- 
straints. The inner radius is constrained to be Tmin = 
1.5i?h by our ability to accurately measure faint object 
magnitudes near the host (see Section [3^ . For the outer 
radius we choose Tmax — 45i?h which corresponds to ap- 
proximately 140 kpc. This choice is a compromise be- 
tween studying an area large enough to find LMC/SMC 
equivalent objects (which are ~ 60 kpc from the Milky 
Way), and keeping the area small enough to limit over- 
lap between host galaxies and allowing us to reason- 
ably apply a background prior determined by the entire 
field. 

The normalized radial probability distribution is 
thus: 



^.(^l7p) 



7p 



^7p+2 _ 7p+2 
. 'max 'min . 



(6) 



10 



Nierenberg et al. 



The normalization of the angular distribution is given by 
a generalized elliptical integral. 

As discussed in Section [3?2l a few of our systems are not 
complete to the same inner radius due to issues with light 
modeling for highly flattened hosts. Furthermore, some 
of the regions far from the hosts are not complete to 45 
due to GOODS field edge effects. We discuss how we 
account for this incompleteness in Appendix [Cj 

5.1.3. Number of satellites per host 

Multiple dark matter simulations have found that the 
number of satellites with a given mass relative to the 
host mass is constant (|Moore et al.lll999l : iKravtsov et al.l 
[200llGao et"all[200l . With this in mind, we model the 
number of satellites within a fixed magnitude range from 
the host as being drawn from a Poisson distribution with 
some constant mean Ng. 

It is important to keep in mind the following caveats. 
Due to baryonic physics, the observable properties of 
galaxies are not exactly scale invariant. In fact, the virial 
mass to light ratio is not a universal function and we ex- 
pect it to vary for the host galaxies and satellites. This 
means that in a fixed magnitude range we are not actu- 
ally probing a fixed virial mass range, but only a fixed 
range in luminosity ratio. However, it should be noted 
that our host galaxies are typically luminous enough 
(several IG^^Lq) that their satellites are also significantly 
brighter than the typical Local Group dwarf galaxies 
where the M/L is believed to be much higher than in mas- 
sive ellipticals. Thus our satellite mass range is better 
characterized by the relative luminosity between host and 
satellites than it would be for the Local Group. 

Naturally, these assumptions will have no effect on our in- 
ference of the parameters of the satellite spatial distribu- 
tion, provided that it is also independent of host galaxy 
luminosity within the spatial and mass ranges considered 
here. 

5.2. Background/ Foreground Objects 

In addition to satellites, each of the host galaxies is 
surrounded in projection by background/foreground ob- 
jects. We isolate the satellite signal by using the prop- 
erties of the entire GOODS fields to provide a strong 
constraint on the background number density. 

Galaxies tend to cluster on scales o f order a typical 
galaxy virial radius, or ^ 400 kpc (e.g. 'Totsuii & Kihara' 
1969; Peebles 1974; Braincrd et al. 1995; ViUumscn et al. 
19971: iZehavi et al.1120021: IMorganson fc Blandfordll2009l ). 
This clustering is believed to be due to t he accretion of 
matter along dark matter filaments (e.g. iBensonI I2010L 
and references therein). Because of clustering, the den- 
sity of objects tends to be higher in regions near bright 
galaxies an d to have fluctuati on amplitudes larger than 
Poissonian. iChen et al.l ()2006[ ) tested a variety of meth- 
ods for removing 'interloper' (background/foreground) 
contamination from their satelli t e sign al using a set of 
ACDM simulations. iChen et al.] ()2006f ) found that sim- 
ply estimating the number density of background objects 
by studying randomly selected regions in their field sig- 
nificantly underestimated the contamination from fore- 



ground/background objects that appeared near their 
hosts in projection. They found the most reliable es- 
timate of the background was obtained by measuring 
the background in annuli centered on their hosts and 
just outside of the area in which they studied the satel- 
lite population. We adopt this method to build a prior 
on the background near the hosts. We study the back- 
ground in annuli between 45 and 60 i?h (typically 140- 
180 kpc). We choose this distance range to ensure that 
we are not including satellites in our estimate, while 
still accurately characterizing the clustering of galaxies 
near the hosts. We find that the density of objects near 
the hosts with magnitudes within our detection range 
(21 < Z850 < 26.5) is Sb^o = 125 ± 2 arcmin"^. As ex- 
pected, this is higher than the average density of objects 
in the GOODS fields (117 ± 0.6) and has larger fluctua- 
tions than one would predict from Poisson noise. 

Recall that we are studying objects brighter than a fixed 
magnitude contrast from the host magnitudes. This 
means that the number of objects we study around a 
given host is a fraction of the density of objects brighter 
than zg5o = 26.5. We correct for this around each of the 
hosts by representing the cumulative distribution func- 
tion (CDF) of the background number counts by a power- 
law (e.g. .Benitez et a l. .2003) 

Nb{< m„,ax) (X 10"^"— (7) 

where ab is the slope of the background number counts. 
The maximum magnitude TOmax for a particular host sys- 
tem is 

JTlmax = "Ih + Am. (8) 

Thus for a given choice of Am, the expected number 
density of background/foreground objects around the j^^ 
host is 

Eb.j = i]j,_jo"'=('"'''J+^"-">'"). (9) 

We measure the background slope to be — 0.28 ±0.01 
in the same annuli near the hosts in which we estimate 
the number density of the background. 

5.3. Analysis 

In the previous two subsections we constructed a model 
which describes the probability of a satellite or fore- 
ground/background object appearing a given distance 
from the host dependent on a choice of parameters. The 
parameters and their priors are listed in Table [H Of key 
importance in our work is that our results are unbinncd 
and analyzed in a fully Bayesian fashion. This means 
that for each parameter our result is a posterior prob- 
ability distribution function (PDF) which describes the 
probability of a value of a parameter being true given 
our data (the likelihood) and our prior knowledge of the 
parameter (the prior). 

In Appendix iP] wc discuss the details of c onst ructing the 
posterior probability function (Equation ID4|) which we 
use to study our parameter posterior PDFs. We compute 
the posterior PDFs using a Markov Chain Monte Carlo 
(MCMC) method. At least 10*^ iterations per chain are 
performed in order to ensure convergence. 

6. RESULTS 
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Table 1 

Summary of model parameters 



Parameter 


Uescription 


Prior 


Satellite Model 
Ns 

7p 
A 

101 


Number of satellites per host 

Logarithmic slope of the satellite radial distribution 

Flattening of the satellite number density distribution relative to host light flattening 

Offset between the major axis of satellite spatial distribution and the major axis of the host light profile. 


U(0,20) " 
U(-5,0) 
U{0,2) 

U(0,7r/2) 


Background Model 






ah 


Number density of background objects with magnitudes between 21 < ^gso < 26.5 
Logarithmic slope of the 2:350 background number counts 


N(125,2) 
N(0.28,0.01) 



^ U(a,b) denotes a uniform distribution between a and b. 

'° N(/i,cr) denotes a normal distribution with mean ^ and standard deviation a. 



We first discuss the results for the A?7i = 5.5 magnitude 
bin, which has the strongest signal. Table [2] contains a 
summary of the results of the inference. The posterior 
PDF for each variable is shown in Figure HI 

The first main result is the clear detection of a peak 
in the posterior PDF for Ng which is well removed from 
zero, indicating the detection of a population of satellites. 
Secondly, we find that the radial density profile of the 
satellite spatial distribution is consistent with isothermal 
(7p = —1). Thirdly, the posterior PDF of 4> is peaked 
at zero indicating that the satellite spatial distribution 
is preferentially aligned with that of the host. A KS test 
shows that a uniform distribution of angles is ruled out 
at more than a 99.99% CL. 

Our results are inconclusive for the relationship between 
ellipticity of the satellite distribution and that of the 
host, described by the parameter A (see Figure [5]). This 
is not surprising as it takes a stronger signal to mea- 
sure the ellipticity of a distribution than to measure its 
alignment. Our inability to infer A is in part due to a 
degeneracy between (f) and A. If A is zero, all values of 
(j) are equally probable. This can be seen in the contour 
plot shown in Figure [H] We show the effects of removing 
this degeneracy by performing a separate analysis on the 
Am — 5.5 data set, keeping (f) fixed at zero. The results 
for this are shown in Figure El When (j) is fixed at zero, 
the inference disfavors small values of A, which implies 
a disfavoring of satellite distributions that are rounder 
that that of the stars. 

We also verified that our results are robust to small 
changes in limiting magnitude by repeating the analysis 
for Am — 6.0 and 5.0 magnitude bins. The Am = 6.0 bin 
clearly shows the presence of a satellite population, with 
the same isothermal slope observed for the Am = 5.5 
range and the same angular alignment with the host ma- 
jor axis. The inferred satellite number is also consistent 
with the Am = 5.5 measurement, although there is a 
longer tail towards higher satellite numbers which might 
indicate a slightly higher number, as expected given the 
fainter limit. As discussed in Section [21 the number of 
hosts that are complete almost halves from Am — 5.5 to 
6.0 so the errors are larger for the Am — 6.0 inference 
overall. Interestingly, the analysis for Am = 5.0 did 
not conclusively detect a satellite population; the num- 
ber of satellites is marginally more than 1-a greater than 
zero, and the uncertainties on the parameters describing 
the spatial distribution are similarly larger. This shows 
the crucial need for deep data in performing this kind of 



measurement. In all cases we recover our priors on the 
background parameters ab and 5]b,o- We are not able 
to constrain these numbers further because in our model 
they are both degenerate with each other and with the 
number of satellites. 

An important degeneracy is that between Ng and 7p. 
The inferred number of satellites is larger for a shallow 
radial profile and vice-versa. From theoretical and ob- 
servational arguments (see the Introduction) we expect 
the radial profile to be close to isothermal and we expect 
alignment between host and satellites. It is thus instruc- 
tive to repeat the analysis by fixing these parameters to 
their expected values 7p = — 1 and (/) = 0, thereby elim- 
inating many of the degeneracies. The resulting poste- 
rior PDFs for Am = 5.5 are shown in Figure fTOl while 
results for all magnitude ranges are summarized in Ta- 
ble O 

As expected, fixing 7p and (j) lowers the uncertainty of 
our inference and the detection of satellites becomes more 
significant. It is also easier to compare the results across 
magnitude bins, and we see how the number of satellites 
indeed increases with Am and is consistent with a cumu- 
lative luminosity function going as (i.e. similar to 
the cumulative mass function predicted from theory as- 
suming m oc L~^), albeit with large errors. Furthermore, 
the posterior PDF of A begins to disfavor a satellite pop- 
ulation that is more isotropic than the stars in the galaxy. 
This is also broadly in line with expectations, as we ex- 
pect that the distribution of stars has been made rounder 
than the host halo by dissipational processes. 

7. DISCUSSION 

This is the first measurement of the numbers and spatial 
distribution of faint satellites of early-type galaxies at 
intermediate redshift so it is useful to provide some con- 
text for our result. Our measurement of the radial slope 
of the satellite spatial distribution is consistent with an 
isothermal distribution, i.e. 7p « —1. This result is 
similar to that found in WIO, and the radial distribu- 
tion of satellites appears to be consistent with that of 
the total mass distribution measured in lensing studies 
(e.g., iKoopmans et al.l 120091 : lAuger et all |2010|) . How- 
ever, given the uncertainty on the inferred slope, the 
spatial distribution of satellites is also consistent with the 
NFW profile inferred by COS. In the radial range covered 
by our work (^-^5-140 kpc), an NFW profile is character- 
ized by a radially averaged projected logarithmic slope of 
approximately 7p —1, with large variations depending 
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Table 2 

Posterior Medians/Confidence Intervals 



Am 


7p 


Ns 


101 

(68% confidence) 


A Sb,o(Nb/arcmin^) 




6.0 
5.5 
5.0 


_1 1+0-5 

-1 0+"=* 
^•'-'-0.4 

-0 

'■'■°~o.s, 


-1 7+0.9 
-'^ '-0.8 

"■0-0.4 


< 44 

< 42 

< 56 


- 1241^ 


r, r,oo + 0.009 

U.ZOO_Q QQg 

n 9Sfi+0-009 

n 909+0.009 

U.ZOZ_Q QQg 



^ No inference on the parameter because the posterior distribution is approxi- 
mately uniform. 



Table 3 

Posterior Medians/Confidence Intervals with fixed parameters 



Fixed parameter 


Am 


7p Ns 


A 

(68% confidence) 


(Nb /arcmin'^ ) 




</> = 


5.5 


1 n+0-3 -1 7+1.0 
-'-■'-'-0.4 -'-•'-0.8 


> 0.73 


123t2 


0.286to°Qgg 


= 0, 7p = -1 
= 0, 7p = -1 
= 0, 7p = -1 


6.0 
5.5 
5.0 


2+^ 

-1 s+o--" 
^■'^-o.e 


> 0.83 

> 0.72 

> 0.73 


124^2 
1241^ 

124 j;2 


n 9Qc:+0.009 

U.Zi50_Q Qgg 

n 9SS+0-008 

n 90-1+0.008 

U.ZC51_Q QQg 



on host scale radius. More accurate nieasurements of the 
radial density profile of satellites are needed, in addition 
to a comparison of the radial profile around different host 
masses, in order to determine whether the distribution 
of satellites follows that of the total mass or that of the 
dark matter component. 

The distribution of satellites is anisotropic and is pref- 
erentially aligned along the major axis of the host 
light profile. This host-satellite alignment is consistent 
with o bservations of host-satellite systems in SDSS (e.g. 
iBraine rd 2005). Furthermore, alignment is predicted 
by ACDM simulations which show satellites accreting 
anisotropically along filaments and appearing aligned 
with the major a xis of the host mass profile (e.g., 
lAubert et aDf200l . Previous observations have estab- 
lished that the host light profile aligns with the host 
mass profile (BOS), and our observation of the satellite- 
host light alignment therefore implies alignment between 
the satellite distribution and the host mass, in agreement 
with simulations. This alignment between host mass and 
satellite spatial distribution has important implications 
for the frequency of flux ratio anomalies, as we discuss 
below. 

Although we do not expect the number of satellites 
of intermediate redshift elliptical galaxies to be exactly 
equal to that of galaxies at lower redshift due to bary- 
onic processes and evolution, it is instructive to com- 
pare the GOODS satellites to other satellite populations. 
Figure [TT] compares the cumulative luminosity function 
(CLF) of our satellites to the CLF of the Milk y Way 
and A ndromeda satellites (adopted from Toller ud et alj 
20081). with o ne sigma uncertainties calculated from 
Gehrelsl (pSfih . and to the CLF of SDSS satellites of 
hosts with varying morphologies adopted from COS. Ide- 
ally, we would have also liked to compare our study to 
JIO which is one of the few studies of satellites of high 
redshift objects. However, the maximum value of Am 



in that survey was about 2.5 magnitudes which is too 
bright to be compared to our measurement in a mean- 
ingful way. 

COS measured the number of satellites and their radial 
profile between approximately 20 and 350 kpc. In or- 
der to compare our results appropriately, we use the COS 
'interloper subtracted' fit to a radial power law given in 
their Table 2 in order to extrapolate their measurement 
inward toward the host. We assume that the power law 
is constant in the inner regions and use a fiducial value 
of i?h 3 kpc to estimate the number of satellites COS 
would have seen in the region that we studied (i.e., as 
close as 1.5 i?h). The Milky Way CLF is complete in 
the same equivalent region as ours so we did not have 
to make any adjustments in order to compare our num- 
bers. All three studies measured magnitudes in differ- 
ent filters; COS used r band magnitudes at an effective 
redshift of z — 0.1, we use observed-frame 0350 magni- 
tudes, and Milky Way measurements are in rest-frame 
V . However, the corrections required to convert from r 
at redshift '^0.1 and observed zgso at redshifts of ^ 0.5 
into rest-frame V are negligible compared to the sizes of 
the bins of Am that we study, and we therefore do not 
include explicit /c-corrections in our analysis. The three 
satellite CLFs are roughly consistent where the measure- 
ments overlap, given the relatively large error bars (Fig- 
ure 11). It is worth noting that our GOODS satellite 
measurements have significantly smaller error bars than 
those for the Local Group by virtue of the large sample 
of hosts we study which allows us to reduce the sampling 
error. Furthermore, we are able to observe much fainter 
satellites than the SDSS study. 

It is also instructive to compare our inferred satellite 
number to the number of minor mergers our hosts are 
expected to undergo in the time span we study. As many 
of our satellites are very near their host galaxies, we ex- 
pect that some of them will be close to merging and that 
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Figure 8. Bivariatc posterior PDFs for all model parameters for the Am = 5.5 data set. The dark and light contours contain regions of 
68 and 95% of the probability respectively. The diagonal shows the marginalized PDF for each parameter. 
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our estimate of the satellite number should account for 
the predicted minor merger rate in the mass range we 
study. The merger rate depends on th e ratio between 
the host and satellite virial masses (e.g.. iFakhouri et alJ 
I2010D . Our typical host has a stellar mass of M* ~ lO^^-^ 
M0 (see Fi gure [It. Froni abun dance matching tech- 
niques (e.g.. iBehroozi et aI1l2010[ ). this corresponds to a 
halo mass of about 10^^ M0. In the luminosity range of 
our satellites the stellar mass to light ratio of the satel- 
lites should be approximately in the range 30-100% o f 
that of the host galaxy (e.g., IKauffmann et all I2003D . 
Using this, the stellar mass of the satellites we study 
is in the range of 0.6 — 2 • 10^ M©. This corresponds 
to virial masses of ~ 2 — 8 • 10^° Mp,, according to the 
best fit to Equation 21 in IBehroozi et all (|2010f ) . Thus 
the we can detect satellites with virial masses of order 
a percent of that of their hosts. This estimate is sub- 
ject to large uncertainties, including those arising from 
tidal forces that tend to remove dark matter from satel- 
lites as they move closer to their host galaxy, thus re- 
duci ng their vi rial mass. Using the best fit to Equation 
1 in IFakhouri et al.. (.201tf ). we estimate that the major- 
ity of our hosts (60%) will undergo about 0.4-0.5 merg- 
ers (depending on the satellite stellar mass to luminosity 
fraction) between redshifts 0.1 and 0.8. It is reassur- 
ing that this number can be easily accounted for by our 
inferred satellite number. Of course, a quantitative com- 
parison requires a detailed evaluation of the merging and 
visibility timescales, which is beyond the scope of this 
paper. However, we will re-visit this point in a future 
paper, as minor mergers have been suggested to play 
an impo rtant role in the e vol ution of ear ly-type galax- 



ies (e.g. [B undy et al. 2007; Bovlan-Kolc hin et al 
Naab et al.|.2009; .Nipoti et al...2009,: .Kavirai et al 



iNaab et ai.i.:^UUU; .iNipoti et ai...:^uuu,: ,Kav 
Hopkins et al.ll2Q10l: Kavirai et al.ll20Tli r 



2008 



2009; 



Finally we discuss the effects our detected satellite pop- 
ulation would have on flux ratio anomalies in quadruply- 
imaged strong gravitational lens systems. We limit our 
discussion here to order of magnitude estimates and gen- 
eral trends, leaving a rigorous quantitative analysis to a 
future paper. 



iDalal fc Kochanekl ()2002i hereafter D02) used the mag- 
nifications of images in five quad lenses to infer a mass 
profile for the lens galaxies. In order to reproduce the 
observed magnifications, D02 inferred the presence of 
a non-smooth component to the mass distribution with 
a mass fraction between 0.6 and 7 % near th e lensed 
image s. iMao et all (I2004D and more recently Xu et alJ 
(|2009l hereafter X09) used ACDM simulations to test 
whether dark substructure embedded within smooth ha- 
los could produce the flux ratio anomalies observed by 
D02. These studies estimated that the mass fraction of 
CDM substructure from their simulations would only be 
about 0.1 % near image positions, which is insufficient to 
produce the fiux ratios studied by D02. The inclusion of 
globular cluster popula tions and baryo nic processes does 
not change this result (|Xu et al.ll2010D and underscores 
the importance of constraining the satellite population 
through direct observations. 

From our inference, a few percent of our hosts have a 
satellite within the annulus that would include the typ- 
ical lens image positions studied by X09 (the Einstein 



radius is approximately 1"). We therefore find that the 
average mass fraction in this annulus is of order a few 
percent. Using the mass function (dn/dm cx m~^'^) 
described by X09, we estimate that the average satel- 
lite mass fraction they would have observed in our mass 
range is of order a percent, which is consistent with our 
result. It is worth considering the possibility that the 
presence of a few relatively massive satellites may in- 
crease the lensing cross-section significantly so that any 
attempt to measure the satellite mass function using 
gravitational lenses would be automatically biased to- 
wards very high mass fractions near the lens Einstein 
radius; we will explore this possibility in more detail in 
paper HI. 

It is interesting to note that, although our typical hosts 
have similar halo masses to those studied in X09, the 
most massive subhalo X09 observed in any of their sim- 
ulations has a virial mass of ^ 10^° M 0, which is on 
the very low mass end of the subhalos studied in this 
work. Our simple extension of the virial to stellar mass 
relationship led us to estimate that our hosts will have 
on average more than one halo at least this massive even 
assuming that the mass to light ratio of our satellites is 
only a few percent of that of the host halo. This discrep- 
ancy may be due to the fact that our mass estimate is 
very approximate and, as already discussed, may not be 
appropriate for systems undergoing strong interactions 
such as tidal stripping. On the other hand, the difference 
in estimated satellite masses may indicate the impor- 
tance of baryonic condensation in preserving th e mass in 
subhalos during accretion (see e.g.. lWeinberg et al.ll2dd8l : 
iRomano-Dfaz et al.l[2009l ). 

Our observation that satellites appear preferentially 
along the host major axis may offer an additional route 
to reduce or eliminate the apparent discrepancy in the 
satellite mass fraction a s inferred by fiux ratio anoma- 
lies. As pointed out bv iZentnerl (12006 ). an anisotropic 
satellite distribution can increase the effective projected 
mass of the host 'felt' by a lensed image by as much as 
a factor of six. Thus, the mass associated with our ob- 
served luminous satellites could in principle be sufficient 
to explain the anomalies detected by D02. 

Our satellites are much fainter than a typical lensed 
quasar image and thus would be easily obscured in any 
of these lensing studies. Our result highlights the im- 
portance of studying the satellite population in non-lens 
systems in conjunction with the analysis of perturbations 
in lens systems in order to attain an understanding of the 
mass and luminosity functions of satellites. 

To conclude, it is important to point out that the largest 
discrepancies between the luminosity and mass function 
are seen for ev en fainter satelli tes than the ones probed 
by this study (iKravtsovl l2010t ). It is thus desirable to 
push the analysis of both the luminosity and the mass 
function of satellites to even smaller fractions of the host 
galaxy properties. 

8. SUMMARY 

We use an advanced host light subtraction method to 
study the spatial distribution of faint galaxies around 
127 early-type galaxies in the GOODS fields between red- 
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shifts 0.1 and 0.8. We employ a self-consistent model in 
the framework of Bayesian inference to disentangle the 
satellite population from background/foreground galax- 
ies. Exploiting the depth and resolution of the HST im- 
ages, we detect satellites up to 5.5 magnitudes fainter 
than the host galaxy and as close as 0'.'5/2.5 kpc to 
the host. Our main results can be summarized as fol- 
lows: 

1. Intermediate redshift, massive, early-type galaxies 
have on average 1.7j^Q'g satellites within our lumi- 
nosity ratio limit. This is consistent with the num- 
ber of satellites observed in the Milky Way. 

2. The number density of satellites follows an ap- 
proximately isothermal radial power law profile 
P{r) oc r-'p with 7p ^ -1.0tEJ;|. 

3. The satellites are preferentially aligned along the 
major axis of the host light profile with |(/)| < 42° 
at the 68% confidence level. 

4. When the offset (j) between the satellite population 
and host light profile is fixed at degrees, the satel- 
lite distribution is inferred to be preferentially more 
elongated than the distribution of host galaxy light. 

5. When the satellite number density profile is as- 
sumed to be isothermal, the average number of 
satellites is inferred to be l.S^n l. 
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Figure 11. Comparison of the measured number of GOODS satel- 
lites to those found for the Milky Way and SDSS galaxies. 
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Table Al 

SEXTRACTOR Parameters 



Parameter 


Value 

Large Object Mask Point Object Mask 


Final Photometry 


DETECT.MINAR,EA 


10 


5 




DEBLEND_NTHRESH 


64 






DEBLEND.MINCONT 


0.001 






DETECT.THRESH 




2.5 




ANALYSIS.THRESH 




2.5 




FILTER.NAME 




gauss_2 . 0_3x3 . conv 




BACK.TYPE 


MANUAL 


MANUAL 


MANUAL 


BACK.VALUE 


0.0 


0.0 


0.0 


Note. — Parameters not 
those used by the GOODS 


listed, or marked 
team and can 


as "-" , are 
be found at. 



http : //archive . stsci . edu/pub/hlsp/goods/catalog_r2 

APPENDIX 

A. HOST GALAXY LIGHT SUBTRACTION 

As we showed in Section [3l the detectability of satellite galaxies is quite sensitive to the presence of light from the 
much brighter host galaxy. In this appendix we give more details on how the light from the host galaxy was subtracted 
in order to allow the faint satellites to be detected. Host galaxy surface brightness modeling is a two step process. 
In the first step, we identify and mask all objects other than the host in the region in which we want to model 
host galaxy light. This ensures that our host model does not attempt to fit the light of nearby objects. Once the 
objects are identified and the mask is created, in the second step we model the host galaxy surface brightness in 2D 
using the unmasked parts of the image, interpolate over the masked parts, and subtract this model from our original 
image. 

A.l. Masking 

The masking is done automatically using the segmentation map produced by an initial SExtractor run, with 
parameters tuned to optimize the detection of small faint objects in the presence of the host light. This optimization 
was performed by simulating faint companion objects at varying positions around our hosts. We found that using 
the GOODS SExtractor parameters at this stage gave unsatisfactory results, in that objects we simulated near 
the hosts were not being masked. We found that, in order to properly account for both faint point sources near 
our hosts and diffuse sources further from our hosts, we needed to use a superposition of masks created with two 
different sets of SExtractor parameters. These parameters are listed in Table \K\\ With this masking routine, we 
were able to accurately recover magnitude 27 point sources as close as 1.5 i?h (typically ~ 0'.'9) from our host galaxy 
centroids. 

A. 2. B-spline model subtraction 

In Figure [5] we show some representative examples of our host galaxy light subtraction process. Some of our hosts 
are very elongated or show disk-like features. The surface brightness distribution of these objects is more difficult to 
model and tends to leave residuals at larger radii than the rounder hosts. Residuals are easily identifiable because of 
the symmetric pattern and distinct elongated shape that they appear in (see, e.g., the residuals of the upper right host 
in Figure [2]). We deal with this small number (~ 20) of more disky/extended sources in two ways. The first way is to 
increase the amount of flexibility in the B-spline fit, increasing the number of multipoles fitted in each of our spline 
rings around the ellipse. The second thing we do is to exclude a larger inner region from our analysis to ensure we are 
not identifying any residuals as objects. 

B. SOURCE EXTRACTOR PARAMETERS AND PHOTOMETRY COMPARISON 

After subtracting the main galaxy light out of the image we run SExtractor on the residual image to identify the 
remaining objects. In this step we modify our SExtractor parameters by increasing the deblending threshold in 
order to match t he p arameters used when making the GOODS catalogs. A full list of our SExtractor parameters 
is listed in Table ET] 

Even after matching the deblending parameters, there are two minor differences between our final SExtractor pa- 
rameters and GOODS SExtractor parameters. The first is that GOODS estimates the background by measuring 
the noise in an annulus with 100 pixel (3") width around each object. This method of background subtraction is not 
optimal for our measurement because we are focusing on a relatively small area (typical image size was ~ 200x200 
pixels), within which we expect to have a relatively high object density. Instead, we estimate the background by calcu- 
lating the 3 a clipped mean within our image. Furthermore, the GOODS team made modifications to SExtractor 
that changed the deblending process slightly. 
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We compare the MAG_AUTO output from SExtractor for the two methods in a 4 arcmin^ cutout fr om the GOODS 
field in order to study the effects of the different background subtraction and deblending (see Figure IBT|) . Using our 
parameters, we identify 484 objects with MAG_AUTO<26.5. Of these, 22 objects do not have a center within O'.'S of 
an object in the GOODS catalog. The major outliers in the MAG_AUTO comparison are all in areas of high object 
density and thus most likely due to differences in deblending. The mean difference (not including major outliers) in 
the MAG_AUTO estimate is (-1.9 ± 6.0) ■ lO^^. 




Figure Bl. Comparison of our photometry witli GOODS in a 4 arcmin field cutout with no host galaxy subtraction. 



C. COMPLETENESS 



In Section [5] we discuss our model for the spatial distribution of satellites which we can use to calculate the probability 
that there are a certain number of objects N within 1.5 < r/Rh < 45, with positions x. Before comparing a particular 
model prediction to the data it is necessary to account for observational limitations such as field edge effects. In this 
section we explain how we account for these observational limitations when inferring our model parameters. 

Following |Keii3 (l2007l) . we define a completeness function, P{I\x), which describes the probability of observing an 
object at position x, where 1 = 1 indicates an object is detected and / = indicates a non-detection. Note that 
because we have chosen to study only magnitude ranges in which we are complete, our completeness function will only 
depend on position in a given field. As discussed in Appendix |Al we measure our completeness near the host galaxy 
by placing simulated objects around the host and calculating our recovery rate and the accuracy of our photometry as 
a function of position. Far from the host, we estimate our completeness in annuli to account for field edge effects. To 
simplify this analysis, we ignore regions of partial completeness so that a region in the field is either 100% complete 
or 0% complete. 

Taking completeness into account, the likelihood of detecting N°^^ total objects (satellites and background), with 
positions {x}, given model parameters 6, becomes 

Fr{N°^'', {x}\e) = Pr(/|{a;})Pr(iV, {x}\e) (CI) 



Our fields have varying sizes but the model parameter N gives the number of objects we expect to find in an ideal 
field where we have 100% completeness between 1.5 and 45 i?h- As discussed above, some of our host systems have 
restricted areas, between rmn and rmx^ where the total area is smaller than in the ideal case. In these cases we must 
scale the model prediction by the probability of finding the object in the reduced area compared to the probability of 
finding the object in the ideal area, and we therefore define an updated prediction for the number of objects N' given 

by 

^. j;'-Piix'\e)dx' 

J^l Pr{x'\e)dx' 



N' = N %-^ ; , (C2) 



Note that is a smooth model prediction of the number of true objects in a field. In order to properly account for 
the fact that the true number of objects in a field is discrete, we must introduce a Poisson probability, which relates 
the "true" discrete number of objects to the model number of objects. The likelihood that we observe N"""^ objects 
at positions x given our model parameters, is a product over the likelihood of each of those positions being true given 
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our model parameters 

i 

Thus the probability that we observe N°^^ given a set of model parameters, taking into account varying total areas of 
completeness, is given by: 

Pr(Ar°''% {x}\9) = Pr(Ar°''"|iV') Y[ Piixi\9). (C4) 

i 

Properly, we should also include a term which accounts for the probability of not observing N — N°^^ objects. However, 
in our case, the region of parameter space we exclude due to observational limitations is always much smaller than the 
region we do study, and thus the number of objects our model predicts that we expect to observe is much larger than 
the number that we cannot observe. Therefore, for simplicity, we do not include this term in our analysis. 

D. INFERENCE METHODOLOGY 

In this section we discuss in detail how we characterize the posterior PDFs for our satellite model parameters 9s = 
{Ns,jp, A,(j)}, and our background model parameters 0b = {5]b,o,ab}- We present a top-down description of our 
model in which we start with the posterior PDF we would like to obtain and break it apart to show where we have 
inserted different pieces of information. 

Our final data set, D, is composed of a set of measurements for all of our host-field systems, D = {Di, D2, D^^} 
Where Dj = {hj,dj}, and hj is the set of measurements of the magnitude, axis ratio and RMS of the j*'' host 

light profile, and dj 



iVf^{(xl),(a;2),...(a;7V-)} 



is the number of objects observed around the j*'' host and their 

positions. Using Bayes' theorem we can express the probability of a set of model parameters being true given the data 
by: 

Pr(6>|D) cx Pr(r>|0)Pr(0) (Dl) 

In the above equation the term Pt{D\9) is the likelihood of the data being true given a set of model parameters and 
Pt{9) is the prior information we have about the model parameters. Our model parameters and their priors are listed 
in Table m 



The first term on the right of Equation ID II is a product of likelihoods for individual host systems: 

PT{D\9) = ]lPT{d,\9,h,) (D2) 
j 

Notice that in our model, the measurements of the host light profile are treated as model parameters which we assume 
are known exactly. This is because we have constructed our model such that its parameters do not predict the 
properties of the host light profile. 

Equation ID2I is itself composed of a product of the likelihood of measuring N°^^ objects around each host times the 
probability for measuring each object position, 

Pridj\9,hj) = PT{Nf^\9)Y[PT{x,\9,hj) (D3) 

i 

Recall that in Section [5] we built a model that was composed of separate contributions from a satellite and background 
population. This means that the probability of finding any object at a given location is the sum of the probability of 
finding a background object at that location with the probability of a finding a satellite at that location: 

PTidj\9, hj) = Pr(A^°''"|6') ]^Pr(a;,|6>, /i^-, 5)Pr(S'|0, h^) + Pr(a;,|6', hj,B)PT{B\9, hj) (D4) 

i 

The term Pi{xi\9,hj, S) is the probability of finding a satellite at a certain position (given by Equation and 
PT{xi\9, hj, B) is the probability of finding a background/foreground object at a certain position (which is uniform). 
The term Pi{S\9, hj) is the probability of an object being a satellite given only the model parameters, 

and Pr{B\9, hj) is defined in an analogous way for the line-of-sight interlopers. The functions that determine Ns and 
TVf, in terms of model parameters are discussed in Section [3] 



